Moonwalk: NRE Optimization in ASIC Clouds or, accelerators will use old silicon
Authors
Abstract
Cloud services are becoming increasingly globalized and datacenter workloads are expanding exponentially. GPU- and FPGA-based clouds have demonstrated improvements in power and performance by accelerating compute-intensive workloads. ASIC-based clouds are a promising way to optimize the Total Cost of Ownership (TCO) of a given datacenter computation (e.g., YouTube transcoding) by reducing both energy consumption and marginal computation cost. The feasibility of an ASIC Cloud for a particular application is directly gated by the ability to manage the Non-Recurring Engineering (NRE) costs of designing and fabricating the ASIC, so that it is significantly lower (e.g., 2×) than the TCO of the best available alternative. In this paper, we show that technology node selection is a major tool for managing ASIC Cloud NRE, and allows the designer to trade off an accelerator's excess energy efficiency and cost performance for lower total cost. We explore NRE and cross-technology optimization of ASIC Clouds for four different applications: Bitcoin mining, YouTube-style video transcoding, Litecoin, and Deep Learning. We address these challenges and show large reductions in the NRE, potentially enabling ASIC Clouds to address a wider variety of datacenter workloads. Our results suggest that advanced nodes like 16nm will lead to sub-optimal TCO for many workloads, and that use of older nodes like 65nm can enable a greater diversity of ASIC Clouds.
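The trade-off described in the abstract can be illustrated with a minimal sketch. It assumes a deliberately simple cost model that is not taken from the paper: total cost is a one-time NRE plus a per-unit TCO multiplied by the workload size. The names `NodeOption`, `total_cost`, `best_node`, and `is_feasible` are hypothetical, the numbers are placeholder values in arbitrary units rather than measured data, and the feasibility check encodes one possible reading of the "2×" margin (total ASIC cost at least that many times lower than the baseline TCO).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeOption:
    """One candidate technology node (all values are illustrative)."""
    name: str
    nre: float           # one-time design and fabrication cost
    tco_per_unit: float  # cost of serving one unit of workload


def total_cost(option: NodeOption, workload: float) -> float:
    """Assumed model: NRE plus TCO that scales linearly with workload."""
    return option.nre + option.tco_per_unit * workload


def best_node(options: list[NodeOption], workload: float) -> NodeOption:
    """Pick the node with the lowest total cost for this workload size."""
    return min(options, key=lambda o: total_cost(o, workload))


def is_feasible(option: NodeOption, workload: float,
                baseline_tco: float, margin: float = 2.0) -> bool:
    """Assumed criterion: ASIC total cost is at least `margin` times
    lower than the TCO of the best non-ASIC alternative."""
    return total_cost(option, workload) * margin <= baseline_tco


# Placeholder options: the older node has low NRE but a higher running
# cost; the newer node has high NRE but a lower running cost.
OPTIONS = [
    NodeOption("older-node", nre=1.0, tco_per_unit=0.5),
    NodeOption("newer-node", nre=10.0, tco_per_unit=0.1),
]

if __name__ == "__main__":
    for workload in (10, 100):
        choice = best_node(OPTIONS, workload)
        print(workload, choice.name, total_cost(choice, workload))
```

Under these placeholder numbers the older node wins for the small workload because NRE dominates the total, while the newer node wins once the workload is large enough to amortize its higher NRE.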
Similar resources
Low Overhead Memory Subsystem Design for a Multicore Parallel DSP Processor
The physical scaling following Moore’s law is saturated while the requirement on computing keeps growing. The gain from improving silicon technology is only the shrinking of the silicon area, and the speed-power scaling has almost stopped in the last two years. It calls for new parallel computing architectures and new parallel programming methods. Traditional ASIC (Application Specific Integrate...
Co-processor approach to accelerating multimedia applications
Microprocessor cores are commonly used as the main component in today's embedded systems. They are designed to elaborate general purpose code, thus they usually cannot provide the necessary performance needed to support all the functionalities required by performance-critical applications. For this reason, together with general purpose processors a series of dedicated components, namely acceler...
The Microprocessor is no more General Purpose : why Future Reconfigurable Platforms will win
The paper is a plaidoyer for a radical methodological change in R&D of dynamically reconfigurable circuits. The paper illustrates, that the current main stream approach based on placement and routing is not very likely to obtain the area-efficiency and throughput needed to cope with the emerging crisis cost of future silicon technology generations. The proposed changes include both: architectur...
Silicon Implementation of SHA-3 Final Round Candidates: BLAKE, Grøstl, JH, Keccak and Skein
Hardware implementation quality is an important factor in selecting the NIST SHA-3 competition finalists. However, a comprehensive methodology to benchmark five final round SHA-3 candidates in ASIC is challenging. Many factors need to be considered, including application scenarios, target technologies and optimization goals. This work describes detailed steps in the silicon implementation of a ...
Silicon Implementation of SHA-3 Finalists: BLAKE, Grøstl, JH, Keccak and Skein
Hardware implementation quality is an important factor in selecting the NIST SHA-3 competition finalists. However, a comprehensive methodology to benchmark five final round SHA-3 candidates in ASIC is challenging. Many factors need to be considered, including application scenarios, target technologies and optimization goals. This work describes detailed steps in the silicon implementation of a ...